Modeling Symmetric Macromolecular Structures in Rosetta3

Author: A Leaver-Fay
Andrew Leaver-Fay
B Kuhlman
C Wang
David Baker
Frank DiMaio
H Abe
I André
Ingemar André
J Moult
JW Ponder
Leaver-Fay
NA Pierce
NG Sgourakis
P Bradley
Phil Bradley
R Das
Vladimir N. Uversky
Publication venue: Public Library of Science
Publication date: 01/01/2011

Symmetric protein assemblies play important roles in many biochemical processes. However, the large size of such systems is challenging for traditional structure modeling methods. This paper describes the implementation of a general framework for modeling arbitrary symmetric systems in Rosetta3. We describe the various types of symmetries relevant to the study of protein structure that may be modeled using Rosetta's symmetric framework. We then describe how this symmetric framework is efficiently implemented within Rosetta, which restricts the conformational search space by sampling only symmetric degrees of freedom, and explicitly simulates only a subset of the interacting monomers. Finally, we describe structure prediction and design applications that utilize the Rosetta3 symmetric modeling capabilities, and provide a guide to running simulations on symmetric systems

Lund University Publications

Carolina Digital Repository

RosettaScripts: A Scripting Language Interface to the Rosetta Macromolecular Modeling Suite

Author: A Korkegian
A Leaver-Fay
A Zanghellini
AA Canutescu
Andrew Leaver-Fay
B Kuhlman
B Lee
CA Smith
D Rothlisberger
David Baker
DJ Mandell
Eva-Maria Strauch
Florian Richter
G Guntas
Gordon Lemmon
IW Davis
J Ashworth
J Meiler
Jacob E. Corn
Jens Meiler
JJ Havranek
Justin Ashworth
L Jiang
MD Tyka
Nobuyasu Koga
Paul Murphy
R Das
RA Lerner
S Chaudhury
S Meyers
Sagar D. Khare
Sarel J. Fleishman
SB Thyme
SJ Fleishman
T Kortemme
Vladimir N. Uversky
W Sheffler
WS Sandberg
Publication venue: Public Library of Science
Publication date: 01/01/2011

Macromolecular modeling and design are increasingly useful in basic research, biotechnology, and teaching. However, the absence of a user-friendly modeling framework that provides access to a wide range of modeling capabilities is hampering the wider adoption of computational methods by non-experts. RosettaScripts is an XML-like language for specifying modeling tasks in the Rosetta framework. RosettaScripts provides access to protocol-level functionalities, such as rigid-body docking and sequence redesign, and allows fast testing and deployment of complex protocols without need for modifying or recompiling the underlying C++ code. We illustrate these capabilities with RosettaScripts protocols for the stabilization of proteins, the generation of computationally constrained libraries for experimental selection of higher-affinity binding proteins, loop remodeling, small-molecule ligand docking, design of ligand-binding proteins, and specificity redesign in DNA-binding proteins

OPUS - University of Technology Sydney

Carolina Digital Repository

A Generic Program for Multistate Protein Design

Some protein design tasks cannot be modeled by the traditional single state design strategy of finding a sequence that is optimal for a single fixed backbone. Such cases require multistate design, where a single sequence is threaded onto multiple backbones (states) and evaluated for its strengths and weaknesses on each backbone. For example, to design a protein that can switch between two specific conformations, it is necessary to find a sequence that is compatible with both backbone conformations. We present in this paper a generic implementation of multistate design that is suited for a wide range of protein design tasks and demonstrate in silico its capabilities at two design tasks: one of redesigning an obligate homodimer into an obligate heterodimer such that the new monomers would not homodimerize, and one of redesigning a promiscuous interface to bind to only a single partner and to no longer bind the rest of its partners. Both tasks contained negative design in that multistate design was asked to find sequences that would produce high energies for several of the states being modeled. Success at negative design was assessed by computationally redocking the undesired protein-pair interactions; we found that multistate design's accuracy improved as the diversity of conformations for the undesired protein-pair interactions increased. The paper concludes with a discussion of the pitfalls of negative design, which has proven considerably more challenging than positive design.

Carolina Digital Repository

Modeling Disordered Regions in Proteins Using Rosetta

Author: A Leaver-Fay
A Zemla
AK Dunker
CA Rohl
CJ Oldfield
David Baker
E Alm
HJ Dyson
JJ Ward
Kristina Krassovsky
Michael Tyka
MV Berjanskii
Ray Yu-Ruei Wang
Vladimir N. Uversky
William Sheffler
Y Shen
Yan Han
Publication venue: Public Library of Science
Publication date: 29/07/2011

Protein structure prediction methods such as Rosetta search for the lowest energy conformation of the polypeptide chain. However, the experimentally observed native state is at a minimum of the free energy, rather than the energy. The neglect of the missing configurational entropy contribution to the free energy can be partially justified by the assumption that the entropies of alternative folded states, while very much less than unfolded states, are not too different from one another, and hence can be to a first approximation neglected when searching for the lowest free energy state. The shortcomings of current structure prediction methods may be due in part to the breakdown of this assumption. Particularly problematic are proteins with significant disordered regions which do not populate single low energy conformations even in the native state. We describe two approaches within the Rosetta structure modeling methodology for treating such regions. The first does not require advance knowledge of the regions likely to be disordered; instead these are identified by minimizing a simple free energy function used previously to model protein folding landscapes and transition states. In this model, residues can be either completely ordered or completely disordered; they are considered disordered if the gain in entropy outweighs the loss of favorable energetic interactions with the rest of the protein chain. The second approach requires identification in advance of the disordered regions either from sequence alone using for example the DISOPRED server or from experimental data such as NMR chemical shifts. During Rosetta structure prediction calculations the disordered regions make only unfavorable repulsive contributions to the total energy. We find that the second approach has greater practical utility and illustrate this with examples from de novo structure prediction, NMR structure calculation, and comparative modeling
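
The two-state picture used in the first approach can be illustrated with a toy calculation. The sketch below is not the Rosetta free energy function: it assumes each residue has an independent interaction energy with the rest of the chain and a fixed entropy gain on disordering (both hypothetical inputs), and it marks a residue as disordered when the entropic gain outweighs the favorable energy that is given up.

```python
def classify_residues(interaction_energies, entropy_gain, temperature=1.0):
    """Label each residue 'ordered' or 'disordered' in a toy two-state model.

    interaction_energies: per-residue energies with the rest of the chain
        (negative = favorable); hypothetical inputs, not Rosetta scores.
    entropy_gain: entropy gained when one residue becomes disordered
        (assumed identical for every residue).
    """
    labels = []
    for energy in interaction_energies:
        # Disordering gives up the interaction energy and gains T * dS.
        delta_g_disorder = -energy - temperature * entropy_gain
        labels.append("disordered" if delta_g_disorder < 0 else "ordered")
    return labels


example_labels = classify_residues([-3.0, -0.5, -2.0, 0.2], entropy_gain=1.0)
print(example_labels)
```

Treating residues independently is a simplification made here for brevity; it ignores that disordering one residue also removes its contacts with its neighbors.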

Predicting the Tolerated Sequences for Proteins and Protein Interfaces Using RosettaBackrub Flexible Backbone Design

Author: A Ernst
A Leaver-Fay
AE Sauer-Eriksson
B Kuhlman
CA Rohl
CA Smith
CA Voigt
Colin A. Smith
CT Saunders
DJ Mandell
DM Fowler
EL Humphris
F Ding
G Fuh
G Pál
GD Friedland
GP Smith
HL Schmidt
I Georgiev
IW Davis
JD Bloom
JD Kotz
JJ Havranek
JR Desjarlais
KM Frey
MD Distefano
N Metropolis
N Ollikainen
N Pokala
NJ Marini
PB Harbury
R Tonikian
RL Dunbrack
RP Laura
SM Larson
T Clackson
T Kortemme
Tanja Kortemme
TP Treynor
Vladimir N. Uversky
X Fu
X Hu
XI Ambroggio
Publication venue: Public Library of Science
Publication date: 18/07/2011

Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others

arXiv.org e-Print Archive

Four small puzzles that Rosetta doesn't solve

Author: A Leaver-Fay
A Onufriev
AG Cochran
Ashley M. Buckle
B Barua
B Kuhlman
B Qian
BM Olivera
C Simmerling
CA McPhalen
CA Rohl
CC Correll
CJ McKnight
CR Woese
D Chivian
D Rothlisberger
DJ Mandell
E Ennifar
F DiMaio
IK McDonald
JK Lassila
JM Blose
JW Neidigh
JW Pitera
KM Misura
KT Simons
L Jiang
LW Guddat
M Dauplais
O Schueler-Furman
P Bradley
P Brion
P Eastman
P Ren
R Bonneau
R Das
R Kratzner
R Zhou
Rhiju Das
S Raman
T Kortemme
XI Ambroggio
Y Shen
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2011

A complete macromolecule modeling package must be able to solve the simplest structure prediction problems. Despite recent successes in high resolution structure modeling and design, the Rosetta software suite fares poorly on deceptively small protein and RNA puzzles, some as small as four residues. To illustrate these problems, this manuscript presents extensive Rosetta results for four well-defined test cases: the 20-residue mini-protein Trp cage, an even smaller disulfide-stabilized conotoxin, the reactive loop of a serine protease inhibitor, and a UUCG RNA tetraloop. In contrast to previous Rosetta studies, several lines of evidence indicate that conformational sampling is not the major bottleneck in modeling these small systems. Instead, approximations and omissions in the Rosetta all-atom energy function currently preclude discriminating experimentally observed conformations from de novo models at atomic resolution. These molecular "puzzles" should serve as useful model systems for developers wishing to make foundational improvements to this powerful modeling suite.
Comment: Published in PLoS One as a manuscript for the RosettaCon 2010 Special Collectio

Real-Time PyMOL Visualization for Rosetta and PyRosetta

Author: A Leaver-Fay
BR Brooks
Brian D. Weitzner
DC Richardson
Evan H. Baugh
F Frigerioa
GT Johnson
Jeffrey J. Gray
KW Kaufmann
N Guex
R Das
RA Sayle
S Chaudhury
S Cooper
Sergey Lyskov
SI O'Donoghue
SJ Fleishman
T Schwede
Vladimir N. Uversky
W Humphrey
WL DeLano
Y Wang
Publication venue: Public Library of Science
Publication date: 16/08/2011

Computational structure prediction and design of proteins and protein-protein complexes have long been inaccessible to those not directly involved in the field. A key missing component has been the ability to visualize the progress of calculations to better understand them. Rosetta is one simulation suite that would benefit from a robust real-time visualization solution. Several tools exist for the sole purpose of visualizing biomolecules; one of the most popular tools, PyMOL (Schrödinger), is a powerful, highly extensible, user friendly, and attractive package. Integrating Rosetta and PyMOL directly has many technical and logistical obstacles inhibiting usage. To circumvent these issues, we developed a novel solution based on transmitting biomolecular structure and energy information via UDP sockets. Rosetta and PyMOL run as separate processes, thereby avoiding many technical obstacles while visualizing information on-demand in real-time. When Rosetta detects changes in the structure of a protein, new coordinates are sent over a UDP network socket to a PyMOL instance running a UDP socket listener. PyMOL then interprets and displays the molecule. This implementation also allows remote execution of Rosetta. When combined with PyRosetta, this visualization solution provides an interactive environment for protein structure prediction and design
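
The idea of running the simulation and the viewer as separate processes linked by UDP can be sketched in a few lines. This is a minimal illustration, not the protocol used by Rosetta or PyMOL: the JSON payload, the function names `send_coordinates` and `receive_coordinates`, and the loopback address are all assumptions made for the example.

```python
import json
import socket


def send_coordinates(coords, address):
    """Send a list of (x, y, z) tuples as one JSON-encoded UDP datagram."""
    payload = json.dumps(coords).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        sender.sendto(payload, address)


def receive_coordinates(listener, bufsize=65535):
    """Block until one datagram arrives and decode it back into coordinates."""
    payload, _sender_address = listener.recvfrom(bufsize)
    return [tuple(point) for point in json.loads(payload.decode("utf-8"))]


# The listener stands in for the viewer; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))
listener.settimeout(5.0)

# The sender stands in for the simulation reporting updated coordinates.
coords = [(0.0, 1.5, -2.25), (3.0, 4.0, 5.0)]
send_coordinates(coords, listener.getsockname())
received = receive_coordinates(listener)
listener.close()
print(received)
```

Because UDP is connectionless, the sender does not wait for the viewer, and the same code works across machines when the listener address is a remote host. The trade-off is that datagrams can be dropped or reordered, which is usually acceptable when each message carries a complete snapshot.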

Transient Protein-Protein Interaction of the SH3-Peptide Complex via Closely Located Multiple Binding Sites

Author: A Angers
A Diaz-Ortiz
A Kramer
A Leaver-Fay
A Spaar
A Zarrinpar
BJ Mayer
BK Kay
C Landgraf
C Tang
Charlotte M. Deane
CY Jia
DA Case
DC Dalgarno
DJ Owen
Dongsup Kim
F Evanics
G Cesareni
G Cestra
G Grigoryan
G Schreiber
I Letunic
J Janin
JA Marles
JE Ladbury
JR Apgar
L Wunderlich
M Ahmad
M Hiipakka
MK Gilson
N Eswar
NL Fawzi
R Frank
R Guerois
S Hahn
S Hong
Seungsoo Hahn
SS Li
T Hou
T Saitoh
TE Smithgall
X Wu
Y Duan
Publication venue: Public Library of Science
Publication date: 22/03/2012

Protein-protein interactions play an essential role in cellular processes. Certain proteins form stable complexes with their partner proteins, whereas others function by forming transient complexes. The conventional protein-protein interaction model describes an interaction between two proteins under the assumption that a protein binds to its partner protein through a single binding site. In this study, we improved the conventional interaction model by developing a Multiple-Site (MS) model in which a protein binds to its partner protein through closely located multiple binding sites on a surface of the partner protein by transiently docking at each binding site with individual binding free energies. To test this model, we used the protein-protein interaction mediated by Src homology 3 (SH3) domains. SH3 domains recognize their partners via a weak, transient interaction and are therefore promiscuous in nature. Because the MS model requires large amounts of data compared with the conventional interaction model, we used experimental data from the positionally addressable syntheses of peptides on cellulose membranes (SPOT-synthesis) technique. From the analysis of the experimental data, individual binding free energies for each binding site of peptides were extracted. A comparison of the individual binding free energies from the analysis with those from atomistic force fields gave a correlation coefficient of 0.66. Furthermore, application of the MS model to 10 SH3 domains lowers the prediction error by up to 9% compared with the conventional interaction model. This improvement in prediction originates from a more realistic description of complex formation than the conventional interaction model. The results suggested that, in many cases, SH3 domains increased the protein complex population through multiple binding sites of their partner proteins. 
Our study indicates that the consideration of general complex formation is important for the accurate description of protein complex formation, and especially for those of weak or transient protein complexes
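
One way to see why several weak sites can raise the bound population is to combine per-site binding free energies through a Boltzmann sum. The abstract does not give the equations of the MS model, so the sketch below is an assumption: it uses the standard expression for mutually exclusive alternative bound states, with hypothetical free energies in kcal/mol.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def effective_binding_free_energy(site_free_energies, temperature=298.15):
    """Combine per-site binding free energies into one effective value.

    Assumes the ligand occupies exactly one site at a time, so the
    association constants of the individual sites add.
    """
    rt = R_KCAL * temperature
    boltzmann_sum = sum(math.exp(-dg / rt) for dg in site_free_energies)
    return -rt * math.log(boltzmann_sum)


single_site = effective_binding_free_energy([-5.0])
three_sites = effective_binding_free_energy([-5.0, -4.5, -4.0])
print(round(single_site, 3), round(three_sites, 3))
```

Under this assumption the effective free energy is never weaker than that of the best single site, and n equivalent sites lower it by RT ln n.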

FigShare

Benchmarking and Analysis of Protein Docking Performance in Rosetta v3.2

Author: A Leaver-Fay
A Sircar
A Sivasubramanian
B Efron
B Pierce
B Raveh
Brian D. Weitzner
C Dominguez
C Wang
EH Baugh
H Hwang
Hannah Bergman
HM Berman
I Andre
IW Davis
J Mintseris
JA Hermoso
Jeffrey J. Gray
JJ Gray
K Wiehe
KT Simons
M Berrondo
MD Daily
MF Lensink
Monica Berrondo
Pravin Muthu
R Mendez
R Morales
R Méndez
S Chaudhury
Sidhartha Chaudhury
SJ Fleishman
SR Comeau
Vladimir N. Uversky
WE Bocik
Publication venue: Public Library of Science
Publication date: 02/08/2011

RosettaDock has been increasingly used in protein docking and design strategies in order to predict the structure of protein-protein interfaces. Here we test capabilities of RosettaDock 3.2, part of the newly developed Rosetta v3.2 modeling suite, against Docking Benchmark 3.0, and compare it with RosettaDock v2.3, the latest version of the previous Rosetta software package. The benchmark contains a diverse set of 116 docking targets including 22 antibody-antigen complexes, 33 enzyme-inhibitor complexes, and 60 ‘other’ complexes. These targets were further classified by expected docking difficulty into 84 rigid-body targets, 17 medium targets, and 14 difficult targets. We carried out local docking perturbations for each target, using the unbound structures when available, in both RosettaDock v2.3 and v3.2. Overall the performances of RosettaDock v2.3 and v3.2 were similar. RosettaDock v3.2 achieved 56 docking funnels, compared to 49 in v2.3. A breakdown of docking performance by protein complex type shows that RosettaDock v3.2 achieved docking funnels for 63% of antibody-antigen targets, 62% of enzyme-inhibitor targets, and 35% of ‘other’ targets. In terms of docking difficulty, RosettaDock v3.2 achieved funnels for 58% of rigid-body targets, 30% of medium targets, and 14% of difficult targets. For targets that failed, we carried out additional analyses to identify the cause of failure, which showed that binding-induced backbone conformation changes account for a majority of failures. We also present a bootstrap statistical analysis that quantifies the reliability of the stochastic docking results. Finally, we demonstrate the additional functionality available in RosettaDock v3.2 by incorporating small molecules and non-protein co-factors in docking of a smaller target set. This study marks the most extensive benchmarking of the RosettaDock module to date and establishes a baseline for future research in protein interface modeling and structure prediction.
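
A bootstrap analysis of the kind mentioned above can be sketched as follows. This is a generic percentile bootstrap over per-target success flags, not the procedure used in the paper, which is not described in the abstract; the function name and the choice to resample whole targets are assumptions, and the flags simply mirror the headline count of 56 funnels among 116 targets.

```python
import random


def bootstrap_success_rate(outcomes, n_resamples=1000, seed=0):
    """Return (mean, lower, upper) for the success rate via percentile bootstrap.

    outcomes: list of 1 (funnel found) or 0 (no funnel), one entry per target.
    The interval is the central 95% of the resampled success rates.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    rates = []
    for _ in range(n_resamples):
        # Draw n targets with replacement and record the success fraction.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        rates.append(sum(resample) / n)
    rates.sort()
    lower = rates[int(0.025 * n_resamples)]
    upper = rates[int(0.975 * n_resamples) - 1]
    return sum(rates) / n_resamples, lower, upper


# Illustrative flags only: 56 successes and 60 failures.
flags = [1] * 56 + [0] * 60
mean_rate, low, high = bootstrap_success_rate(flags)
print(f"{mean_rate:.3f} [{low:.3f}, {high:.3f}]")
```

A fixed seed keeps the interval reproducible between runs; a wider interval signals that the reported success rate depends strongly on which targets happen to be in the set.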